Ridge Regression in Prediction Problems: Automatic Choice of the Ridge Parameter

Authors

  • Erika Cule
  • Maria De Iorio
Abstract

To date, numerous genetic variants have been identified as associated with diverse phenotypic traits. However, identified associations generally explain only a small proportion of trait heritability and the predictive power of models incorporating only known-associated variants has been small. Multiple regression is a popular framework in which to consider the joint effect of many genetic variants simultaneously. Ordinary multiple regression is seldom appropriate in the context of genetic data, due to the high dimensionality of the data and the correlation structure among the predictors. There has been a resurgence of interest in the use of penalised regression techniques to circumvent these difficulties. In this paper, we focus on ridge regression, a penalised regression approach that has been shown to offer good performance in multivariate prediction problems. One challenge in the application of ridge regression is the choice of the ridge parameter that controls the amount of shrinkage of the regression coefficients. We present a method to determine the ridge parameter based on the data, with the aim of good performance in high-dimensional prediction problems. We establish a theoretical justification for our approach, and demonstrate its performance on simulated genetic data and on a real data example. Fitting a ridge regression model to hundreds of thousands to millions of genetic variants simultaneously presents computational challenges. We have developed an R package, ridge, which addresses these issues. Ridge implements the automatic choice of ridge parameter presented in this paper, and is freely available from CRAN.
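
As a minimal sketch of how the ridge parameter controls the amount of shrinkage, the example below solves the penalised normal equations (X'X + kI)β = X'y for a few hand-picked values of k. It is not the automatic selection method presented in the paper, and it does not use the ridge R package; the function name `ridge_fit`, the toy data, and the values of k are all assumptions made for illustration.

```python
# Illustrative sketch only: ridge_fit and the toy data are hypothetical,
# and the ridge parameter k is fixed by hand rather than chosen from the data.

def ridge_fit(X, y, k):
    """Return beta solving (X'X + k*I) beta = X'y by Gaussian elimination."""
    n, p = len(X), len(X[0])
    # Build the augmented matrix [X'X + k*I | X'y].
    A = []
    for i in range(p):
        row = [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
        row[i] += k  # the ridge penalty is added to the diagonal
        row.append(sum(X[r][i] * y[r] for r in range(n)))
        A.append(row)
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    beta = [0.0] * p
    for i in reversed(range(p)):
        s = A[i][p] - sum(A[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = s / A[i][i]
    return beta


# Two strongly correlated, centred predictors and a centred response (toy data).
X = [[-2.0, -1.9], [-1.0, -1.1], [0.0, 0.1], [1.0, 0.9], [2.0, 2.0]]
y = [-4.1, -1.9, 0.2, 2.1, 3.7]

for k in (0.0, 1.0, 10.0):
    beta = ridge_fit(X, y, k)
    squared_norm = sum(b * b for b in beta)
    print(f"k={k:5.1f}  beta={beta}  squared norm={squared_norm:.4f}")
```

With k = 0 the fit reduces to ordinary least squares; larger values of k pull the coefficient vector towards zero, which is the shrinkage the abstract refers to.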

Similar resources

A semi-automatic method to guide the choice of ridge parameter in ridge regression

We consider the application of a popular penalised regression method, Ridge Regression, to data with very high dimensions and many more covariates than observations. Our motivation is the problem of out-of-sample prediction and the setting is high-density genotype data from a genome-wide association or resequencing study. Ridge regression has previously been shown to offer improved performance ...

Ridge regression in prediction problems: automatic choice of the ridge parameter Supporting Information

Table 1: Four simulation scenarios used in the evaluation of the bias-variance decomposition. The simulation scenarios are taken from Zou & Hastie (2005).
Columns: scenario; n; p; β; structure of X.
(1) n = 100; p = 8; β = (3, 1.5, 0, 0, 2, 0, 0, 0); corr(i, j) = 0.5^|i−j|
(2) n = 100; p = 8; βj = 0.85 for all j; corr(i, j) = 0.5^|i−j|
(3) n = 50; p = 40; βj = 0 for j = (1, . . . , 10, 21, . . . , 30) and βj = 1 for j = (11, . . . , 20, 31, . . . , 40); corr (i, j...

Generalized Ridge Regression Estimator in Semiparametric Regression Models

In the context of ridge regression, the estimation of the ridge (shrinkage) parameter plays an important role in analyzing data. Much effort has gone into developing techniques and methods for computing shrinkage estimators for different fully parametric ridge regression approaches, using eigenvalues. However, the estimation of the shrinkage parameter has been neglected for semiparametric regression models. The m...

Two-Parameters Fuzzy Ridge Regression with Crisp Input and Fuzzy Output

In this paper a new weighted fuzzy ridge regression method for a given set of crisp input and triangular fuzzy output values is proposed. In this regard, ridge estimator of fuzzy parameters is obtained for regression model and its prediction error is calculated by using the weighted fuzzy norm of crisp ridge coefficients. To evaluate the proposed regression model, we introduce the fu...

Prediction of chronological age based on Demirjian dental age using robust ridge regression method

Introduction: Estimation of age has an important role in legal medicine, endocrine diseases and clinical dentistry. Correspondingly, evaluation of dental development stages is more valuable than tooth erosion. In this research, the modeling of calendar age has been done using new and rich statistical methods. It can be considered a practicable method in medical science that is...

Journal title:

Volume 37  Issue 

Pages  -

Publication date 2013